Quantifying the Effects of Mask Metadata Disclosure and Multiple Releases on the Confidentiality of Geographically Masked Health Data
Abstract
The availability of individual-level health data presents opportunities for monitoring the distribution and spread of emergent, acute, and chronic conditions, as well as challenges with respect to maintaining the anonymity of persons with health conditions. Particularly when such data are mapped as point locations, concerns arise regarding the ease with which individual identities may be determined by linking geographic coordinates to digital street networks, then determining residential addresses and, finally, names of occupants at specific addresses. The utility of such datasets must therefore be balanced against the requirements of protecting the confidentiality of individuals whose identities might be revealed through the availability of precise and accurate locational data. Recent literature has pointed towards geographic masking as a means for striking an appropriate balance between data utility and confidentiality. However, questions remain as to whether certain characteristics of the mask (mask metadata) should be disclosed to data users and whether two or more distinct masked versions of the data can be released without breaching confidentiality. In this article, we address these questions by quantifying the extent to which the disclosure of mask metadata and the release of multiple masked versions may affect confidentiality, with a view towards providing guidance to custodians of health datasets. The masks considered include perturbation, areal aggregation, and their combination. Confidentiality is measured by the areas of confidence regions for individuals’ locations, which are derived under the probability models governing the masks, conditioned on the disclosed mask metadata.
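As a rough illustration of how a disclosed mask parameter and a second release can each shrink a confidence region, the sketch below assumes a uniform perturbation mask: every masked point is drawn uniformly from a disc of radius r centered on the true location. That model, the function names, and the numbers are assumptions made for illustration and are not taken from the article. Under this assumption, disclosing r confines the true location to a disc of area πr² around a single released point, and two independent releases confine it to the intersection of two such discs.

```python
import math

def disc_area(r):
    """Area of the region consistent with ONE masked point when radius r is disclosed."""
    return math.pi * r * r

def lens_area(r, d):
    """Area of the intersection of two discs of equal radius r with centers d apart."""
    if d >= 2 * r:
        return 0.0  # discs do not overlap (cannot occur if both points mask the same location)
    if d == 0:
        return disc_area(r)  # identical masked points add no information
    return 2 * r * r * math.acos(d / (2 * r)) - (d / 2) * math.sqrt(4 * r * r - d * d)

def region_area_two_releases(r, p1, p2):
    """Area consistent with two masked points p1 and p2 (hypothetical helper)."""
    d = math.hypot(p1[0] - p2[0], p1[1] - p2[1])
    return lens_area(r, d)

if __name__ == "__main__":
    r = 100.0                                   # assumed disclosed radius, in meters
    first, second = (0.0, 0.0), (120.0, 50.0)   # assumed masked coordinates
    print("one release :", disc_area(r))
    print("two releases:", region_area_two_releases(r, first, second))
```

The farther apart the two masked points are, the smaller the overlap and the weaker the protection, which is the kind of effect that measuring confidentiality by confidence-region area is meant to capture.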
Similar resources
Aggregation methods to evaluate multiple protected versions of the same confidential data set
This work is about disclosure risk for national statistical offices and, more particularly, for the case of releasing multiple protected versions of the same micro-data files. That is, several copies of a single original data file are released to several data users. Each user receives a protected copy, and the masking method for each copy is selected according to the research interests of the u...
Designing a model of confidentiality principles for electronic health record information for Iran - 1386
Introduction: The rapid growth of health information has led to the adoption of new technologies, such as the electronic health record, for managing and using that information appropriately. The growing capacity of information technologies to collect, store, and transmit information has raised considerable concern, since electronic records can be accessed by numerous consume...
Medical confidentiality: a comparative study of the principles of medical ethics and the teachings of Islamic ethics
Confidentiality is one of the oldest principles of the medical profession that impacts on the relationship between physician and patient, the personal interests of patient and physician and consequently social welfare. While emphasizing the necessity of confidentiality, religious teachings consider disclosure of others' secrets a sin that deserves punishment thereafter. Nowadays, medical develo...
A Method for Protecting Access Pattern in Outsourced Data
Protecting the information access pattern, which means preventing the disclosure of data and structural details of databases, is very important in working with data, especially in the cases of outsourced databases and databases with Internet access. The protection of the information access pattern indicates that mere data confidentiality is not sufficient and the privacy of queries and accesses...
A Risk-Utility Framework for Categorical Data Swapping
Data swapping is a statistical disclosure limitation method used to protect the confidentiality of data by interchanging variable values between records. We propose a risk-utility framework for selecting an optimal swapped data release when considering several swap variables and multiple swap rates. Risk and utility values associated with each such swapped data file are traded off along a front...
Publication date: 2006